Chinese Sentence Generation in a Knowledge-Based Machine Translation System
Abstract
This paper presents a technique for generating Chinese sentences from the Interlingua expressions used in the KANT knowledge-based machine translation system. Chinese sentences are generated directly from the semantic representation using a unification-based generation formalization which takes advantage of certain linguistic features of Chinese. Direct generation from the semantic form eliminates the need for an intermediate syntactic structure, thus simplifying the generation procedure. The generation algorithm is top-down, data-driven, and recursive. The descriptive nature of the pseudo-unification grammar formalism used in KANT allows the grammar developer to write very straightforward semantic grammar rules. We also discuss some of the crucial problems in Chinese language generation and describe how they can be dealt with in our framework. This technique has been implemented in a prototype Chinese sentence generation system for KANT. Some implementation details and experimental results concerning the prototype are presented at the end of this paper.
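As a rough illustration of what "top-down, data-driven, and recursive" generation directly from a semantic form can look like, the following minimal Python sketch realizes a nested semantic frame as a Chinese string. Everything in it is an assumption made for illustration: the frame layout, the concept names (`*EAT`, `*APPLE`, and so on), the `LEXICON` and `RULES` tables, and the `generate` function are hypothetical and are not taken from KANT. It also uses plain dictionary lookup and slot ordering rather than the pseudo-unification formalism the abstract refers to.

```python
# Hypothetical toy lexicon: interlingua-style concept -> Chinese word.
LEXICON = {
    "*EAT": "吃",
    "*APPLE": "苹果",
    "*SPEAKER": "我",
    "*RED": "红",
}

# Hypothetical semantic rules: for each frame type, the order in which
# slots are realized. "head" stands for the frame's own concept.
RULES = {
    "event": ["agent", "head", "theme"],  # subject-verb-object order
    "object": ["attribute", "head"],      # modifier precedes the head noun
}


def generate(frame):
    """Realize a semantic frame as a string, with no syntax tree in between.

    Top-down: start at the outermost frame.
    Recursive: each filled slot is generated by the same function.
    Data-driven: only slots present in the input frame are realized.
    """
    if isinstance(frame, str):
        # A bare concept is realized by lexicon lookup.
        return LEXICON[frame]
    parts = []
    for slot in RULES[frame["type"]]:
        if slot == "head":
            parts.append(LEXICON[frame["concept"]])
        elif slot in frame:
            parts.append(generate(frame[slot]))
    return "".join(parts)


# Toy input frame (hypothetical), roughly "I eat a red apple".
frame = {
    "type": "event",
    "concept": "*EAT",
    "agent": {"type": "object", "concept": "*SPEAKER"},
    "theme": {"type": "object", "concept": "*APPLE", "attribute": "*RED"},
}

print(generate(frame))
```

In this sketch the rules map semantic slots straight to surface order, which is the sense in which no intermediate syntactic structure is built; a real grammar would also need to handle measure words, aspect markers, and similar phenomena.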
Similar resources
A Hybrid Machine Translation System Based on a Monotone Decoder
In this paper, a hybrid Machine Translation (MT) system is proposed by combining the result of a rule-based machine translation (RBMT) system with a statistical approach. The RBMT uses a set of linguistic rules for translation, which leads to better translation results in terms of word ordering and syntactic structure. On the other hand, SMT works better in lexical choice. Therefore, in our sys...
Sentence Generation by Analogy: Towards the Construction of A Quasi-parallel Corpus for Chinese-Japanese
Parallel corpora are indispensable resources in data-driven approaches to machine translation: statistical and example-based. The major problem inherent in developing a Chinese-Japanese machine translation system is the lack of bilingual parallel corpus. We have implemented several based on proportional analogy techniques to produce new quasi-parallel sentences using monolingual data, collected ...
Towards a discourse relation-aware approach for Chinese-English machine translation
Translation of discourse relations is one of the recent efforts of incorporating discourse information into statistical machine translation (SMT). While existing works focus on disambiguation of ambiguous discourse connectives, or transformation of discourse trees, only explicit discourse relations are tackled. A greater challenge exists in machine translation of Chinese, since implicit discourse...
Automatic sentence segmentation and punctuation prediction for spoken language translation
This paper studies the impact of automatic sentence segmentation and punctuation prediction on the quality of machine translation of automatically recognized speech. We present a novel sentence segmentation method which is specifically tailored to the requirements of machine translation algorithms and is competitive with state-of-the-art approaches for detecting sentence-like units. We also des...
Semantic MMT Model Based on Hierarchical Network of Concepts in Chinese-English MT
To study the generation of the semantic tree of Chinese sentence in Chinese-English machine translation (MT), a new semantic-analysis model of Chinese multiple-branched and multiple-labeled tree (MMT) based on the hierarchical network of concepts (HNC) is proposed. Supported by word and rule knowledge-base of HNC, the model executed the semantic analysis using static and dynamic labels as a comp...